Abstract

The slope of the best-fit line obtained by minimizing the sum of both the
squared vertical errors and the squared horizontal errors is shown to be
a root of a fourth-degree polynomial.


1 Introduction 

With simple linear regression we have data
$\{(x_1, Y_1 \mid X = x_1), \ldots, (x_n, Y_n \mid X = x_n)\}$
and we minimize the sum of the squared vertical errors. The question
posed here is "Can we effectively minimize both the sum of the squared vertical
and squared horizontal errors?" For notational convenience, we assume that the
data are positively correlated.

As an example, suppose we have paired data (X, Y) where we first fit a linear
function $f(x) = y = \beta_0 + \beta_1 x$ to the data. For instance, Y could be
the grade point average (GPA) at graduation from a four-year university for a
student, and X the corresponding SAT score before matriculation. Typically,
admissions committees use such a least-squares model to measure the
effectiveness of the SAT scores in the admission process.

Suppose now we want to perform an inverse prediction at the value $y_0$ to
answer the question "What SAT score should an admissions candidate receive
in order to have a predicted GPA of, say, 2.0?" This is found from the inverse
function $f^{-1}(y) = x = y/\beta_1 - \beta_0/\beta_1$.
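
The inverse prediction can be illustrated with a minimal Python sketch. The SAT and GPA values below are invented purely for illustration, and the helper names `ols_fit` and `inverse_predict` are hypothetical; the fit is the ordinary least-squares line, and the inverse prediction simply evaluates $f^{-1}(y_0)$.

```python
# Hypothetical (SAT, GPA) pairs, invented purely for illustration.
sat = [1000.0, 1100.0, 1200.0, 1300.0, 1400.0]
gpa = [2.4, 2.7, 3.0, 3.1, 3.5]


def ols_fit(x, y):
    """Return (intercept, slope) minimizing the squared vertical errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def inverse_predict(y0, intercept, slope):
    """Solve y0 = intercept + slope * x for x, i.e. evaluate f^{-1}(y0)."""
    return y0 / slope - intercept / slope


b0, b1 = ols_fit(sat, gpa)
x0 = inverse_predict(2.0, b0, b1)
print(f"intercept={b0:.4f}, slope={b1:.6f}, SAT for predicted GPA 2.0: {x0:.1f}")
```

Feeding the returned SAT score back into the fitted line recovers the target GPA of 2.0, which is all that inverse prediction with $f^{-1}$ guarantees; it says nothing about how well the line fits in the horizontal direction.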

2 Model 

For inverse prediction, we will want both $f(x)$ and $f^{-1}(y)$ to "fit" the
data, and we hope that the squared vertical and squared horizontal errors will
both be small for the fitted line $h(x) = \beta_0 + \beta_1 x$, which minimizes
both the squared vertical and squared horizontal errors. To that end, set
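
$$SSE(\beta_0, \beta_1) = \gamma \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 + (1 - \gamma) \sum_{i=1}^{n} \left( x_i - \frac{y_i - \beta_0}{\beta_1} \right)^2,$$

where $0 \le \gamma \le 1$. The parameter $\gamma$ allows for a weighting of the
two components of SSE, yielding the least squares estimators for $f(x)$ as
$\gamma \to 1$, and the least squares estimators of $f^{-1}(y)$ as $\gamma \to 0$.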
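
As a rough sketch of how the two components enter the objective, the following Python function evaluates a $\gamma$-weighted sum of squared vertical and squared horizontal errors for a trial line. It assumes the objective is $\gamma$ times the vertical sum of squares plus $(1 - \gamma)$ times the horizontal sum of squares, with the horizontal error measured as $x_i - (y_i - \beta_0)/\beta_1$; the function name, the data, and the trial coefficients are invented for illustration.

```python
def weighted_sse(x, y, b0, b1, gamma):
    """gamma * (vertical SS) + (1 - gamma) * (horizontal SS) for y = b0 + b1 * x."""
    vertical = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    horizontal = sum((xi - (yi - b0) / b1) ** 2 for xi, yi in zip(x, y))
    return gamma * vertical + (1.0 - gamma) * horizontal


# Hypothetical data and trial line, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.2, 2.8]
b0, b1, gamma = 0.2, 0.9, 0.7

direct = weighted_sse(x, y, b0, b1, gamma)

# Each horizontal error equals the vertical error divided by -b1,
# so the objective collapses to a multiple of the vertical sum of squares.
vertical_ss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
collapsed = (gamma + (1.0 - gamma) / b1 ** 2) * vertical_ss

print(direct, collapsed)
```

The two printed values agree up to rounding. Under the stated assumption, the weighted objective is $(\gamma + (1 - \gamma)/\beta_1^2)$ times the usual vertical sum of squares, which is why the intercept does not depend on $\gamma$ while the slope does.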

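We compute

$$\frac{\partial SSE}{\partial \beta_0} = -2 \left( \gamma + \frac{1 - \gamma}{\beta_1^2} \right) \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0$$

with root

$$\beta_0 = \bar{y} - \beta_1 \bar{x},$$

independent of $\gamma$, the same as in simple linear regression.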
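To find the slope $\beta_1$, we substitute $\beta_0 = \bar{y} - \beta_1 \bar{x}$ and compute

$$\frac{\partial SSE}{\partial \beta_1} = -2\gamma \sum_{i=1}^{n} (x_i - \bar{x}) \left[ (y_i - \bar{y}) - \beta_1 (x_i - \bar{x}) \right] + \frac{2(1 - \gamma)}{\beta_1^2} \sum_{i=1}^{n} (y_i - \bar{y}) \left[ (x_i - \bar{x}) - \frac{y_i - \bar{y}}{\beta_1} \right] = 0.$$

Set $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$, $S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$,
and $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$,

and let $\rho = S_{xy}/\sqrt{S_{xx} S_{yy}}$ denote the correlation.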

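After some manipulation, the roots of $\partial SSE / \partial \beta_1 = 0$ are found by solving

$$\gamma \sqrt{\frac{S_{xx}}{S_{yy}}} \, \beta_1^4 - \gamma \rho \beta_1^3 + (1 - \gamma) \rho \beta_1 - (1 - \gamma) \sqrt{\frac{S_{yy}}{S_{xx}}} = 0. \qquad (5)$$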
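
One way to carry out the manipulation is sketched below. It assumes that SSE is the $\gamma$-weighted sum of the squared vertical and squared horizontal errors and that $\beta_0 = \bar{y} - \beta_1 \bar{x}$ has already been substituted, so that SSE is a function of $\beta_1$ alone.

```latex
\begin{align*}
SSE(\beta_1)
  &= \gamma \sum_{i=1}^{n} \bigl[ (y_i - \bar{y}) - \beta_1 (x_i - \bar{x}) \bigr]^2
   + (1 - \gamma) \sum_{i=1}^{n} \bigl[ (x_i - \bar{x}) - (y_i - \bar{y})/\beta_1 \bigr]^2 \\
  &= \Bigl( \gamma + \frac{1 - \gamma}{\beta_1^2} \Bigr)
     \bigl( S_{yy} - 2 \beta_1 S_{xy} + \beta_1^2 S_{xx} \bigr), \\[4pt]
\frac{\beta_1^3}{2} \, \frac{d\,SSE}{d\beta_1}
  &= \gamma S_{xx} \beta_1^4 - \gamma S_{xy} \beta_1^3
   + (1 - \gamma) S_{xy} \beta_1 - (1 - \gamma) S_{yy}.
\end{align*}
```

Setting the last expression to zero and dividing by $\sqrt{S_{xx} S_{yy}}$ gives Equation (5), since $S_{xy}/\sqrt{S_{xx} S_{yy}} = \rho$, $S_{xx}/\sqrt{S_{xx} S_{yy}} = \sqrt{S_{xx}/S_{yy}}$, and $S_{yy}/\sqrt{S_{xx} S_{yy}} = \sqrt{S_{yy}/S_{xx}}$.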


The (positive) root of Equation (5) is the slope of the line that
minimizes the $\gamma$-weighted sum of the squared vertical and squared
horizontal errors.

With $\gamma = 1.00$, the slope is $\beta_1 = \rho \sqrt{S_{yy}/S_{xx}}$; with
$\gamma = 0.00$, the slope is $\beta_1 = (1/\rho) \sqrt{S_{yy}/S_{xx}}$; and in general,

$$\rho \sqrt{S_{yy}/S_{xx}} \le \beta_1 \le (1/\rho) \sqrt{S_{yy}/S_{xx}}.$$
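
Equation (5) can be solved with any standard root finder. One simple possibility, sketched here and not taken from the text, is bisection on the interval between the two limiting slopes; in terms of the sums of squares these endpoints are $S_{xy}/S_{xx}$ and $S_{yy}/S_{xy}$. The function names and the summary statistics in the usage lines are hypothetical, and the polynomial coded below is Equation (5) multiplied through by $\sqrt{S_{xx} S_{yy}}$.

```python
def quartic(b1, sxx, syy, sxy, gamma):
    """Left-hand side of Equation (5), multiplied by sqrt(Sxx * Syy)."""
    return (gamma * sxx * b1 ** 4 - gamma * sxy * b1 ** 3
            + (1.0 - gamma) * sxy * b1 - (1.0 - gamma) * syy)


def weighted_slope(sxx, syy, sxy, gamma, iterations=200):
    """Bisection for the root of Equation (5) between the limiting slopes.

    Assumes sxy > 0 (positively correlated data) and 0 <= gamma <= 1.
    Under those assumptions the polynomial is <= 0 at the lower endpoint
    and >= 0 at the upper endpoint, so the interval brackets a root.
    """
    lo = sxy / sxx  # rho * sqrt(Syy / Sxx), the gamma = 1 slope
    hi = syy / sxy  # (1 / rho) * sqrt(Syy / Sxx), the gamma = 0 slope
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if quartic(mid, sxx, syy, sxy, gamma) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical summary statistics, for illustration only.
sxx, syy, sxy = 10.0, 8.0, 6.0
for g in (0.0, 0.5, 1.0):
    print(g, weighted_slope(sxx, syy, sxy, g))
```

With these inputs the slopes at $\gamma = 1$ and $\gamma = 0$ come out at the two endpoints of the interval, and the slope for $\gamma = 0.5$ falls between them, in line with the bounds above.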

3 An Example

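Suppose the data set is $\{(0,0), (0,0), (1,0), (1,1)\}$ with $\bar{x} = 1/2$,
$\bar{y} = 1/4$, $S_{xx} = 1$, $S_{yy} = 3/4$, and $\rho = \sqrt{3}/3 = 0.5774$.
If we choose $\gamma = 0.9$, then from Equation (5) $\beta_1 = 0.6612$, and from
$\beta_0 = \bar{y} - \beta_1 \bar{x}$ we get $\beta_0 = 1/4 - (1/2)\beta_1 = -0.08060$.
The bounds for $\beta_1$, namely $\rho \sqrt{S_{yy}/S_{xx}} \le \beta_1 \le (1/\rho) \sqrt{S_{yy}/S_{xx}}$,
are $1/2 \le \beta_1 \le 3/2$.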
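
The numbers in this example can be cross-checked without the quartic. The sketch below computes the summary statistics from the four data points and then minimizes the weighted objective directly with a brute-force grid scan between the two bounds. It assumes the objective is the $\gamma$-weighted sum of squared vertical and horizontal errors with $\beta_0 = \bar{y} - \beta_1 \bar{x}$ substituted; the grid scan is a stand-in chosen for simplicity, and the variable names are illustrative.

```python
data = [(0, 0), (0, 0), (1, 0), (1, 1)]
gamma = 0.9

n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n
s_xx = sum((x - x_bar) ** 2 for x, _ in data)
s_yy = sum((y - y_bar) ** 2 for _, y in data)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in data)
rho = s_xy / (s_xx * s_yy) ** 0.5


def profiled_sse(b1):
    """Weighted SSE after substituting b0 = y_bar - b1 * x_bar."""
    vertical_ss = s_yy - 2.0 * b1 * s_xy + b1 ** 2 * s_xx
    return (gamma + (1.0 - gamma) / b1 ** 2) * vertical_ss


lower = s_xy / s_xx  # rho * sqrt(Syy / Sxx)
upper = s_yy / s_xy  # (1 / rho) * sqrt(Syy / Sxx)

# Brute-force grid scan between the bounds (not a root finder for the quartic).
steps = 100_000
grid = [lower + (upper - lower) * k / steps for k in range(steps + 1)]
b1_hat = min(grid, key=profiled_sse)
b0_hat = y_bar - b1_hat * x_bar

print(f"x_bar={x_bar}, y_bar={y_bar}, Sxx={s_xx}, Syy={s_yy}, rho={rho:.4f}")
print(f"bounds: {lower} <= b1 <= {upper}")
print(f"b1 ~ {b1_hat:.4f}, b0 ~ {b0_hat:.4f}")
```

The grid minimizer should land close to the slope and intercept quoted above, which gives an independent check that the root of the quartic is indeed the minimizer of the weighted objective for this data set.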